This document is a record of my reading, an exercise in RMarkdown, and procrastination material.
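# packages used below
library(data.table); library(dplyr); library(tidyr); library(lubridate)
library(ggplot2); library(viridis); library(plotly)
library(DT); library(reactable); library(reactablefmtr)
books <- fread("C:/Users/User/OneDrive - London School of Hygiene and Tropical Medicine/Documents/books/Book1.csv")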
b <- select(books, c(1:ncol(books)))
names(b) <- c("name", "start", "end", "days", "rating","type", "genre","review","link")
# 1-day books have log(days) = 0, so add 0.5 to keep them visible when points are sized by logdays
b <- b %>%
  mutate(across(c(start, end), ~ as.Date(.x, format = "%d/%m/%Y")), logdays = log(days) + 0.5) %>%
  filter(!is.na(days)) %>%
  mutate(type = as.factor(type), genre = as.factor(genre),
         # carry the link only for books that actually have a review
         reviewed = case_when(review != "" ~ link, review == "" ~ ""))
According to reputable sources, the country that reads the most is India, at 10 hours per week; the US manages a measly six hours, and Japan and Korea boast an honest four and three hours respectively. In Europe, 80% of the inhabitants of Luxembourg, only half of whom are Luxembourgers, read at least one book a year. Only 30% of their fellow European-Unioners in Romania claim to achieve such reading rates.
An even more reputable source has found that the average US adult reads 12 books a year. Now this seems like a bit much, but of course it does. On the one hand, surveys*; on the other, most people don’t read much at all while some people read far more than 12 books a year, which pulls the mean well above the median. And remember, surveys*. The typical person is much more likely to read close to 4 books a year.
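To see how a few heavy readers can drag the mean away from the median, here is a minimal sketch. The numbers in `books_per_year` are made up to illustrate the point, not taken from any survey, and the variable names are my own.

```r
# Hypothetical yearly book counts for 11 readers (invented, not survey data):
# most read a handful of books, two read a great deal.
books_per_year <- c(0, 1, 2, 3, 4, 4, 4, 5, 6, 40, 63)

mean_books   <- mean(books_per_year)    # pulled up by the two heavy readers
median_books <- median(books_per_year)  # the reader in the middle

print(c(mean = mean_books, median = median_books))
```

With these invented counts the mean comes out at 12 and the median at 4, so both headline figures can describe the same group of readers.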
I have been keeping an imperfect record of the books I have read and listened to “cover to cover” since 2017, now formatted in the table below; some have accompanying review links.
dt <- b %>%
  select(-logdays, -review, -link) %>%
  # the piped data is already the first argument, so b must not be passed again
  DT::datatable(filter = "top") # try next reactable functions
dt
Before we do some boring statistics, here is a plot to hover over. Can you find the Harry Potter cluster?
p <- b %>%
ggplot(aes(end,rating,
fill = type, stroke = .3, label = name, duration = days, review = reviewed)) +
geom_jitter(width = 0.45, height = 0.45,
size = b$logdays, na.rm = FALSE) +
scale_fill_viridis(discrete = TRUE) +
xlab("date finished") +
ggtitle("Reading timeline by rating, type and time taken")+
theme_bw()
ggplotly(p, tooltip = c("name", "rating", "days", "review"))
How do I compare to the average book-enjoyer?
What type and genre of book do I like the most?
Do I exhibit seasonal patterns?
# name the grouping column so the table shows "year" rather than "year(end)"
yavg <- b %>%
  group_by(year = year(end)) %>%
  summarize(n_books = n(), sum_days = sum(days), rating = round(mean(rating), 2))
yavg %>%
reactable(.,
defaultSorted = "n_books",
defaultSortOrder = "desc",
theme = fivethirtyeight(),
columns = list(
n_books = colDef(
style = color_scales(.)
),
sum_days = colDef(
style = color_scales(.)
),
rating = colDef(
style = color_scales(.)
))
) %>%
add_subtitle("yearly averages")
tab <- pivot_wider(as.data.frame(table(year(b$end), b$type)),
id_cols = "Var1", names_from = "Var2", values_from = "Freq") %>% rename(Year = Var1)
reactable(tab, theme = fivethirtyeight()) %>% add_subtitle("yearly frequency of book types read")